Linear-algebraic list decoding of folded Reed-Solomon codes 



Venkatesan Guruswami* 






Computer Science Department 
Carnegie Mellon University 
Pittsburgh, PA 15213 



Abstract 



Folded Reed-Solomon codes are an explicit family of codes that achieve the optimal trade-off
between rate and error-correction capability: specifically, for any $\epsilon > 0$, the author and
Rudra (2006, 08) presented an $n^{O(1/\epsilon)}$ time algorithm to list decode appropriate folded RS
codes of rate $R$ from a fraction $1 - R - \epsilon$ of errors. The algorithm is based on multivariate
polynomial interpolation and root-finding over extension fields. It was noted by Vadhan that
interpolating a linear polynomial suffices if one settles for a smaller decoding radius (but still
enough for a statement of the above form). Here we give a simple linear-algebra based analysis of
this variant that eliminates the need for the computationally expensive root-finding step over
extension fields (and indeed any mention of extension fields). The entire list decoding algorithm
is linear-algebraic, solving one linear system for the interpolation step, and another linear
system to find a small subspace of candidate solutions. Except for the step of pruning this
subspace, the algorithm can be implemented to run in quadratic time.

The theoretical drawback of folded RS codes is that both the decoding complexity and the proven
worst-case list-size bound are $n^{\Omega(1/\epsilon)}$. By combining the above idea with a
pseudorandom subset of all polynomials as messages, we get a Monte Carlo construction achieving a
list-size bound of $O(1/\epsilon^2)$, which is quite close to the existential $O(1/\epsilon)$ bound
(however, the decoding complexity remains $n^{\Omega(1/\epsilon)}$).

Our work highlights that constructing an explicit subspace-evasive subset that has small
intersection with low-dimensional subspaces (an interesting problem in pseudorandomness in its own
right) could lead to explicit codes with better list-decoding guarantees.



1 Introduction 

Reed-Solomon (RS) codes are an important family of error-correcting codes with many applications
in theory and practice. An $[n, k]_q$ RS code over the field $\mathbb{F}_q$ with $q$ elements
encodes a polynomial $f \in \mathbb{F}_q[X]$ of degree at most $k - 1$ by its evaluations at $n$
distinct elements of $\mathbb{F}_q$. The encodings of any two distinct polynomials differ in at
least $n - k + 1$ positions, which bestows the RS code with an error-correction capability of
$(n - k)/2$ worst-case errors. Classical algorithms, the first one due to Peterson [20] over 50
years ago, are able to decode such an RS code from up to $(n - k)/2$ errors (i.e., a fraction
$(1 - R)/2$ of errors, where $R = k/n$ is the rate of the code) in polynomial time.

Decoding beyond the radius $(1 - R)/2$ is not possible if the decoder is required to always
identify the correct message unambiguously. However, allowing the decoder to output a small list 
in the worst-case enables decoding well beyond this bound. This notion is called list decoding, 
and has been an actively researched topic in the last decade. It has found many applications in 
complexity theory and pseudorandomness (see [23, 24, 26] for some surveys) beyond its direct 
relevance to error-correction and communication. 

For RS codes, Sudan [22] gave a list decoding algorithm to decode beyond the $(1 - R)/2$ radius
for rates $R < 1/3$. For rates $R \to 0$, the algorithm could correct a fraction of errors
approaching 1, a remarkable feature that led to many complexity-theoretic applications. The author
and Sudan [14] improved the error-correction radius to $1 - \sqrt{R}$, matching the so-called
"Johnson radius," which is the a priori lower bound on the list-decoding radius of a code as a
function of its distance alone. This result improved upon the traditional $(1 - R)/2$ bound for all
rates. The $1 - \sqrt{R}$ bound remains the best error-correction radius achievable to date for
list decoding RS codes.

A standard random coding argument, however, shows the existence of rate $R$ codes
$C \subseteq \Sigma^n$ list-decodable even up to radius $1 - R - \epsilon$. Specifically, $C$ has
the combinatorial property that for every $y \in \Sigma^n$, there are at most $L = O(1/\epsilon)$
codewords of $C$ within Hamming distance $(1 - R - \epsilon)n$ from $y$. Here $\epsilon > 0$ can be
an arbitrarily small constant. The quantity $L$ is referred to as the list-size. Note that $1 - R$
is a clear information-theoretic limit for error-correction, since at least $Rn$ received symbols
must be correct to have any hope of recovering the $Rn$ message symbols.

A few years back the author and Rudra, building upon the work of Parvaresh and Vardy [19],
gave an explicit construction of codes of rate $R$ which are list-decodable in polynomial time up
to radius $1 - R - \epsilon$, with a list-size of $n^{O(1/\epsilon)}$ [13]. These codes were a
"folded" version of Reed-Solomon codes, defined below.

Definition 1 ($m$-folded Reed-Solomon code). Let $\gamma \in \mathbb{F}_q$ be a primitive element
of $\mathbb{F}_q$. Let $n \le q - 1$ be a multiple of $m$, and let $1 \le k < n$ be the degree
parameter.

The folded Reed-Solomon (FRS) code $\mathrm{FRS}_q^{(m)}[n, k]$ is a code over alphabet
$\mathbb{F}_q^m$ that encodes a polynomial $f \in \mathbb{F}_q[X]$ of degree at most $k - 1$ as$^1$

$$f(X) \mapsto \left( \begin{bmatrix} f(1) \\ f(\gamma) \\ \vdots \\ f(\gamma^{m-1}) \end{bmatrix}, \begin{bmatrix} f(\gamma^m) \\ f(\gamma^{m+1}) \\ \vdots \\ f(\gamma^{2m-1}) \end{bmatrix}, \ldots, \begin{bmatrix} f(\gamma^{n-m}) \\ f(\gamma^{n-m+1}) \\ \vdots \\ f(\gamma^{n-1}) \end{bmatrix} \right). \qquad (1)$$

Observe that the FRS code has block length $N = n/m$ and rate $R = k/n$ (equal to the rate of the
original, unfolded Reed-Solomon code, which corresponds to the choice $m = 1$). For any integer
$s$, $1 \le s \le m$, a list decoding algorithm for the above FRS codes for a fraction
$\approx 1 - \left(\frac{mR}{m-s+1}\right)^{s/(s+1)}$ of errors is presented in [13], with decoding
complexity and list-size $q^s$. The result of [13] can also be viewed as a better algorithm for
decoding Reed-Solomon codes when the errors occur in bursts, since the evaluation points of the RS
encoding are usually ordered as powers of $\gamma$ for some primitive $\gamma$.

1 The actual code depends also on the choice of the primitive element $\gamma$. But the results
hold for any choice of primitive $\gamma$, so for notational convenience we suppress the dependence
on $\gamma$ and assume some canonical choice of $\gamma$ is fixed.
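
As an illustration of Definition 1, the following is a minimal sketch of the folding map, restricted (as an assumption made here for simplicity) to a prime field $\mathbb{F}_p$ so that plain modular arithmetic suffices. The function name `frs_encode` and the toy parameters are hypothetical and not part of the paper.

```python
def frs_encode(coeffs, p, gamma, n, m):
    """Fold the RS encoding of f into N = n/m columns of m symbols each.

    coeffs: coefficients of f, lowest degree first, over the prime field F_p.
    gamma:  assumed to be a primitive element of F_p.
    """
    assert n % m == 0 and n <= p - 1

    def f(x):
        acc = 0
        for c in reversed(coeffs):      # Horner evaluation mod p
            acc = (acc * x + c) % p
        return acc

    # Unfolded RS codeword: f(1), f(gamma), ..., f(gamma^(n-1))
    evals = [f(pow(gamma, i, p)) for i in range(n)]
    # Bundle m consecutive evaluations into one symbol of F_p^m
    return [tuple(evals[j * m:(j + 1) * m]) for j in range(n // m)]


# Toy instance: p = 13, gamma = 2 (primitive mod 13), f(X) = 1 + 2X + 3X^2
codeword = frs_encode([1, 2, 3], p=13, gamma=2, n=12, m=3)
```

The block length drops from $n = 12$ to $N = 4$ while the number of message symbols stays at $k = 3$, so the rate $k/n$ is unchanged.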

For suitably large constants $s, m$ depending on $\epsilon$, the above list decoding radius for FRS
codes exceeds $1 - R - \epsilon$. However, the list-size bound then becomes
$n^{\Omega(1/\epsilon)}$, which has a rather poor dependence on the distance $\epsilon$ to the
optimal trade-off. Improving the list-size is therefore an important open question. Recall that
existentially a list-size as small as $O(1/\epsilon)$ is possible. The decoding algorithms in
[19, 13] consist of two steps (see Section 2.1 for more details): (i) multivariate polynomial
interpolation (to find an algebraic equation that candidate message polynomials $f$ must satisfy),
and (ii) solving this equation via root-finding over extension fields. The interpolation step
reduces to finding a nonzero solution to a homogeneous linear system, and theoretically the second
step is the computationally more expensive one.

Vadhan showed recently that a weaker decoding radius (which however still suffices to list
decode up to radius $1 - R - \epsilon$) can be achieved by a simplified interpolation step that
only interpolates a degree 1 multivariate polynomial [25]. Further, there is no need to use
multiplicities in the interpolation as in the earlier algorithms [14, 19, 13].$^2$ This offers a
clean and simple exposition of a list decoding algorithm for FRS codes (that can be viewed as a
multidimensional version of the Welch-Berlekamp decoder for RS codes) for a fraction
$\frac{s}{s+1}\left(1 - \frac{mR}{m-s+1}\right)$ of errors [10] (see Section 2.2). The second
root-finding step of the decoder, however, remained unchanged.

Contributions of this work. Here, we note that this Welch-Berlekamp style "degree 1" list decoder
not only offers a simpler exposition, but also offers some promising advantages. Our starting
point is the simple observation that in this case the candidate solutions to the algebraic
equation form an affine subspace (of the full message space $\mathbb{F}_q^k$). This implies that
the second step of the list decoding can also be tackled by solving a linear system!

By inspecting the structure of this linear system, we give an elementary linear-algebraic proof
(Lemma 6) that the subspace of solutions has dimension at most $s - 1$, a fact that was earlier
proved by root counting over extension fields in [13, 25]. This shows that the exponential
dependence in $s$ of the list-size bound was inherently because of the dimension of the
interpolation (and it wasn't crucial that we had the identity
$f(\gamma^{s-1}X) = f(X)^{q^{s-1}}$ over some extension field$^3$).

The linear-algebraic proof also gives a quadratic time algorithm to find a basis for the subspace
(instead of the cubic time of Gaussian elimination). This leads to a quadratic runtime for the list
decoder, except for the final step of pruning the subspace to actually find the close-by codewords
(formal statement in Theorem 7). This pruning step needs to check each element of the subspace
and thus unfortunately could still take $q^s$ time. However, in practice (or when errors occur
randomly), the dimension of the output subspace will likely be very small, probably even 0
(implying a unique solution), and in such cases we get significant gains in efficiency compared to
[13].



2 However, the method of multiplicities is still crucial if one wants a soft-decision list decoder, which, at least for 
Reed-Solomon codes, has been a very influential development [17], with many subsequent papers looking at practical 
decoding architectures. 

3 This identity, however, seems to be the only known way to bound the list-size when higher degrees are used in the
interpolation.

Better list-size via subspace-evasive sets. Our second contribution is to exploit the subspace
structure of the candidate solutions to improve the list-size bound. The idea is to restrict the co- 
efficient vectors of the message polynomial to a large "subspace-evasive" subset that has small 
intersection with subspaces of low dimension. Subspace-evasive sets seem like fundamental com- 
binatorial objects interesting in their own right. They are related to affine extractors, and also 
have applications to constructing bipartite Ramsey graphs [21]. As one would expect, a random 
set has excellent subspace-evasiveness, but finding good explicit constructions is wide open. Our 
application to list decoding in this work provides another motivation for the interesting problem 
of constructing subspace-evasive sets. 

Using a pseudorandom construction of subspace-evasive subsets (in fact, algebraic varieties)
based on limited independence, we give a Monte Carlo construction (succeeding with high
probability) of rate $R$ codes list-decodable up to a fraction $1 - R - \epsilon$ of errors with a
list-size of $O(1/\epsilon^2)$ (Theorem 10 gives the exact statement). Due to the pruning step, the
worst-case runtime is however still $n^{\Omega(1/\epsilon)}$. Nevertheless, this is the first
construction with a better than $n^{\Omega(1/\epsilon)}$ list-size for decoding up to the
information-theoretic limit of $1 - R - \epsilon$ fraction of errors.

For this construction, we do not know a polynomial time computable encoding function that
maps messages to polynomials in the subspace-evasive subset. However, if we settle for a list-size
of $O(n)$ (still much better than the earlier $n^{\Omega(1/\epsilon)}$ bound), a polynomial time
encoder can also be obtained. We stress that only our code construction is randomized, and once it
succeeds (which happens w.h.p.), the list decoding properties hold for every received word and the
encoding and list decoding procedures run in deterministic polynomial time.

Organization. We describe the list decoding algorithm for FRS codes and our linear-algebraic 
analysis of it in Section 2. We make some related remarks about the linear algebra approach in 
Section 3. We use subspace-evasive sets to give our Monte Carlo construction of codes achieving 
list decoding capacity with improved list-size in Section 4. 



2 List decoding folded Reed-Solomon codes 



Suppose a codeword of the $m$-folded RS code (Definition 1) was transmitted and we received a
string $y \in (\mathbb{F}_q^m)^N$, which we view as an $m \times N$ matrix over $\mathbb{F}_q$:

$$\begin{pmatrix} y_0 & y_m & \cdots & y_{n-m} \\ y_1 & y_{m+1} & \cdots & y_{n-m+1} \\ y_2 & y_{m+2} & \cdots & y_{n-m+2} \\ \vdots & \vdots & & \vdots \\ y_{m-1} & y_{2m-1} & \cdots & y_{n-1} \end{pmatrix} \qquad (2)$$

We would like to recover a list of all polynomials $f \in \mathbb{F}_q[X]$ of degree at most
$k - 1$ whose folded RS encoding (1) agrees with $y$ in at least $N - e$ columns, for some error
bound $e$. Note that an agreement means that all $m$ values in that particular column match. The
following theorem is from [13].

Theorem 2. For every integer $s$, $1 \le s \le m$, and any constant $\delta > 0$, there is a list
decoding algorithm for the folded Reed-Solomon code $\mathrm{FRS}_q^{(m)}[n, k]$ that list decodes
from up to $e$ errors as long as

$$e \le N - (1 + \delta)\,\frac{\left(k^s N (m - s + 1)\right)^{1/(s+1)}}{m - s + 1},$$

where $N = n/m$ is the block length of the code. The algorithm runs in $(O_\delta(q))^{O(s)}$ time
and outputs a list of size at most $q^s$.

Note that the fraction of errors corrected by this algorithm as a function of the rate
$R = k/n = k/(Nm)$ is

$$1 - (1 + \delta)\left(\frac{mR}{m - s + 1}\right)^{s/(s+1)}. \qquad (3)$$

By picking $\delta \approx \epsilon$, $s \approx 1/\epsilon$ and $m \approx 1/\epsilon^2$, the
above quantity is at least $1 - R - \epsilon$, and the decoding complexity and list size are
$\approx q^{O(1/\epsilon)}$.



2.1 Overview of above decoding algorithm 

We briefly recap the high level structure of this decoding algorithm. The quantity $s$ is a
parameter of the algorithm. In the first step, the algorithm interpolates a multivariate polynomial
$Q \in \mathbb{F}_q[X, Y_1, Y_2, \ldots, Y_s]$ of low weighted degree (where the $Y_i$'s have weight
$k - 1$ and $X$ has weight 1) such that, for every $i$, $0 \le i \le n - s$,
$Q(X, Y_1, \ldots, Y_s)$ vanishes at
$(\gamma^i, y_i, y_{i+1}, \ldots, y_{i+s-1}) \in \mathbb{F}_q^{s+1}$ with high multiplicity (related
to the other parameter $\delta$ of the algorithm). This step can be accomplished by solving a
homogeneous linear system over $\mathbb{F}_q$. The degree and multiplicity parameters in the
interpolation step are carefully picked to ensure the following two properties: (i) a nonzero $Q$
meeting the interpolation requirements exists, and (ii) every $f \in \mathbb{F}_q[X]$ of degree at
most $k - 1$ whose FRS encoding agrees with $y$ on at least $N - e$ places (and which, therefore,
must be output by the list decoder) satisfies the functional equation

$$Q(X, f(X), f(\gamma X), \ldots, f(\gamma^{s-1} X)) = 0.$$

In the second step of the decoder, all solutions $f$ to the above equation are found. This is done
by observing that $f(\gamma X) \equiv f(X)^q \pmod{E(X)}$ where $E(X) = X^{q-1} - \gamma$, and
therefore $f \bmod E(X)$ can be found by finding the roots of the univariate polynomial

$$T(Y) = Q(X, Y, Y^q, \ldots, Y^{q^{s-1}}) \bmod E(X)$$

with coefficients from $L = \mathbb{F}_q[X]/(E(X))$. The polynomial $E(X)$ is irreducible over
$\mathbb{F}_q$ and therefore $L$ is an extension field. The parameter choices ensure that
$T \ne 0$, and thus $T$ cannot have too many roots and these roots may all be found in polynomial
time. Finally, this list is pruned to only output those polynomials whose FRS encoding is in fact
close to the received word $y$.



2.2 A Welch-Berlekamp style interpolation 

We will now describe a variant of the above scheme where the interpolation step will fit a
nonzero "linear" polynomial $Q(X, Y_1, Y_2, \ldots, Y_s)$ (with degree 1 in the $Y_i$'s). This can
be viewed as a higher-dimensional generalization of the Welch-Berlekamp algorithm [27, 8]. This
elegant version is due to Vadhan and is described in his monograph [25, Chap. 5] (and also used in
the author's lecture notes [10]). For completeness, and because it will be convenient to refer to
it in the second step, we give a self-contained presentation here.

The original motivation for this variant was that it had simpler parameter choices and an easier 
exposition (even though the error-correction guarantee worsened, it still allowed approaching a 
decoding radius of $1 - R$ in the limit). In particular, it has the advantage of not requiring the use
of multiplicities in the interpolation. (Essentially, the freedom to do s-variate interpolation for 
a parameter s of our choosing allows us to work with simple interpolation while still gaining in 
error-correction radius with increasing s. This phenomenon also occurred in one of the algorithms 
in [12] for list decoding correlated algebraic-geometric codes.) 

In this work, our contribution is to put the simple linear structure of the interpolated poly- 
nomial to good use and exploit it to substitute the root-finding step with a more efficient step of 
solving a linear system. 

Given a received word as in (2) we will interpolate a nonzero polynomial

$$Q(X, Y_1, Y_2, \ldots, Y_s) = A_0(X) + A_1(X) Y_1 + A_2(X) Y_2 + \cdots + A_s(X) Y_s \qquad (4)$$

over $\mathbb{F}_q$ with the degree restrictions $\deg(A_i) \le D$ for $i = 1, 2, \ldots, s$ and
$\deg(A_0) \le D + k - 1$, where the degree parameter $D$ is chosen to be

$$D = \left\lfloor \frac{N(m - s + 1) - k + 1}{s + 1} \right\rfloor. \qquad (5)$$

The number of monomials in a polynomial $Q$ with these degree restrictions equals

$$(D + 1)s + D + k = (D + 1)(s + 1) + k - 1 > N(m - s + 1) \qquad (6)$$

for the above choice (5) of $D$. The interpolation requirements on
$Q \in \mathbb{F}_q[X, Y_1, \ldots, Y_s]$ are the following:

$$Q(\gamma^{im+j}, y_{im+j}, y_{im+j+1}, \ldots, y_{im+j+s-1}) = 0 \quad \text{for } i = 0, 1, \ldots, n/m - 1, \; j = 0, 1, \ldots, m - s. \qquad (7)$$

Since the number of interpolation conditions $(n/m) \cdot (m - s + 1)$ is less than the number of
degrees of freedom (monomials) in $Q$, we can conclude the following. The claim about the
near-linear runtime has been shown in [4] (see Proposition 5.11 in Chapter 5).

Lemma 3. A nonzero $Q \in \mathbb{F}_q[X, Y_1, \ldots, Y_s]$ of the form (4) satisfying the
interpolation conditions (7) can be found by solving a homogeneous linear system over
$\mathbb{F}_q$ with at most $Nm$ constraints and variables. Further, this interpolation can be
performed in $O(Nm \log^2(Nm) \log\log(Nm))$ operations over $\mathbb{F}_q$.
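
The interpolation step can be sketched as a plain nullspace computation. The code below is a minimal illustration over a prime field $\mathbb{F}_p$ (an assumption made here), using ordinary Gaussian elimination rather than the near-linear-time method of [4]. The names `nullspace_vector`, `interpolate` and `poly_eval` and the toy parameters are hypothetical.

```python
def poly_eval(c, x, p):
    """Evaluate a coefficient list (lowest degree first) at x, mod p."""
    return sum(ci * pow(x, e, p) for e, ci in enumerate(c)) % p


def nullspace_vector(rows, ncols, p):
    """Return a nonzero v with rows . v = 0 (mod p), or None if none exists."""
    rows = [r[:] for r in rows]
    pivots, r = {}, 0                       # pivot column -> row index
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] % p), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, p)        # modular inverse (Python 3.8+)
        rows[r] = [(x * inv) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] % p:
                fac = rows[i][c]
                rows[i] = [(a - fac * b) % p for a, b in zip(rows[i], rows[r])]
        pivots[c] = r
        r += 1
        if r == len(rows):
            break
    free = next((c for c in range(ncols) if c not in pivots), None)
    if free is None:
        return None
    v = [0] * ncols
    v[free] = 1
    for c, ri in pivots.items():
        v[c] = (-rows[ri][free]) % p
    return v


def interpolate(y, p, gamma, n, m, s, k):
    """Find [A_0, A_1, ..., A_s] of the form (4) meeting conditions (7).

    y is the received word as a flat list y_0, ..., y_{n-1}.
    """
    N = n // m
    D = (N * (m - s + 1) - k + 1) // (s + 1)            # equation (5)
    ncols = (D + k) + s * (D + 1)                        # monomial count, equation (6)
    rows = []
    for i in range(N):
        for j in range(m - s + 1):
            x = pow(gamma, i * m + j, p)
            xs = [pow(x, e, p) for e in range(D + k)]
            row = xs[:]                                  # coefficients of A_0
            for t in range(s):                           # coefficients of A_{t+1}
                yt = y[i * m + j + t]
                row += [(yt * xe) % p for xe in xs[:D + 1]]
            rows.append(row)
    v = nullspace_vector(rows, ncols, p)
    A0, rest = v[:D + k], v[D + k:]
    return [A0] + [rest[t * (D + 1):(t + 1) * (D + 1)] for t in range(s)]


# Toy instance: error-free received word for f(X) = 1 + 2X + 3X^2 over F_13
p, gamma, n, m, s, k = 13, 2, 12, 4, 2, 3
f = [1, 2, 3]
y = [poly_eval(f, pow(gamma, i, p), p) for i in range(n)]
A = interpolate(y, p, gamma, n, m, s, k)
```

With these parameters $D = 2$, so there are 11 unknown coefficients against 9 interpolation conditions, and a nonzero solution is guaranteed by counting.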

The following lemma shows that any such polynomial $Q$ gives an algebraic condition that the
message polynomials $f(X)$ we are interested in list decoding must satisfy.

Lemma 4. If $f \in \mathbb{F}_q[X]$ is a polynomial of degree at most $k - 1$ whose FRS encoding
(1) agrees with the received word $y$ in at least $t$ columns for
$t > \frac{D + k - 1}{m - s + 1}$, then

$$Q(X, f(X), f(\gamma X), \ldots, f(\gamma^{s-1} X)) = 0. \qquad (8)$$

Proof. Define $R(X) = Q(X, f(X), f(\gamma X), \ldots, f(\gamma^{s-1} X))$. Due to the degree
restrictions on $Q$, the degree of $R(X)$ is easily seen to be at most $D + k - 1$. If the FRS
encoding of $f$ agrees with $y$ in the $i$'th column (for some $i \in \{0, 1, \ldots, N - 1\}$), we
have

$$f(\gamma^{im}) = y_{im}, \quad f(\gamma^{im+1}) = y_{im+1}, \quad \ldots, \quad f(\gamma^{im+m-1}) = y_{im+m-1}.$$

Together with the interpolation conditions (7), this implies $R(\gamma^{im+j}) = 0$ for
$j = 0, 1, \ldots, m - s$. In other words, $R$ picks up at least $m - s + 1$ distinct roots for each
such column $i$. Thus $R$ must have at least $t(m - s + 1)$ roots in all. Since
$\deg(R) \le D + k - 1$, if $t > (D + k - 1)/(m - s + 1)$, we must have $R = 0$. □

For the choice of $D$ in (5), the requirement on $t$ in Lemma 4 is met if
$t(m - s + 1) > \frac{N(m - s + 1) + s(k - 1)}{s + 1}$, and hence if

$$t \ge \frac{N}{s + 1} + \frac{s}{s + 1} \cdot \frac{k}{m - s + 1} = N \left( \frac{1}{s + 1} + \frac{s}{s + 1} \cdot \frac{mR}{m - s + 1} \right). \qquad (9)$$

In other words, the fractional agreement needed is
$\frac{1}{s+1} + \frac{s}{s+1} \cdot \frac{mR}{m-s+1}$. Note that by the AM-GM inequality this
agreement is always higher than the agreement fraction
$\left(\frac{mR}{m-s+1}\right)^{s/(s+1)}$ needed in (3).$^4$ Thus this variant corrects a smaller
fraction of errors. Nevertheless, with the choice $s \approx 1/\epsilon$ and
$m \approx 1/\epsilon^2$, the fraction of errors corrected can still exceed $1 - R - \epsilon$.
Further, as we see next, it offers some advantages when it comes to retrieving the solutions $f$
to (8).
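
The AM-GM comparison can be spelled out as follows. The shorthand $x = mR/(m-s+1)$ is introduced here only for this worked step and is not notation from the original text; the factor $(1+\delta)$ in (3) is ignored, as in the comparison above.

```latex
% Weighted AM-GM applied to one copy of 1 and s copies of x > 0:
\[
  \frac{1}{s+1} + \frac{s}{s+1}\,x
  \;=\; \frac{1 + \overbrace{x + \cdots + x}^{s \text{ times}}}{s+1}
  \;\ge\; \left(1 \cdot x^{s}\right)^{1/(s+1)}
  \;=\; x^{s/(s+1)},
  \qquad x = \frac{mR}{m-s+1},
\]
% with equality if and only if x = 1.
```

The left-hand side is the agreement fraction from (9) and the right-hand side is the one from (3), so the linear variant needs at least as much agreement for every choice of parameters.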



2.3 Retrieving candidate polynomials / 

By the preceding section, to complete the list decoding we need to find all polynomials
$f \in \mathbb{F}_q[X]$ of degree at most $k - 1$ that satisfy

$$A_0(X) + A_1(X) f(X) + A_2(X) f(\gamma X) + \cdots + A_s(X) f(\gamma^{s-1} X) = 0. \qquad (10)$$

We note the following simple but very useful fact:

Observation 5. The above is a system of linear equations over $\mathbb{F}_q$ in the coefficients
$f_0, f_1, \ldots, f_{k-1}$ of the polynomial $f(X) = f_0 + f_1 X + \cdots + f_{k-1} X^{k-1}$.
Thus, the solutions $(f_0, f_1, \ldots, f_{k-1})$ of (10) form an affine subspace of
$\mathbb{F}_q^k$.

In particular, the above immediately gives an efficient algorithm to find a compact represen- 
tation of all the solutions to (10) — simply solve the linear system! This simple observation is the 
starting point driving this work. 

We next prove that when $\gamma$ is primitive, the space of solutions has dimension at most
$s - 1$. Note that we already knew this by the earlier argument over the extension field
$\mathbb{F}_q[X]/(X^{q-1} - \gamma)$. But it is instructive to give a direct proof of this working
only over $\mathbb{F}_q$. The proof in fact works when the order of $\gamma$ is at least $k$.
Further, it exposes the simple structure of the linear system, which can be used to find a basis
for the solutions in quadratic time.

4 Recall that for Reed-Solomon codes ($m = 1$) this was also exactly the case: the classical
algorithms unique decoded the codeword when the agreement fraction was at least $(1 + R)/2$, and
the list decoding algorithm in [14] list decoded from agreement fraction $\sqrt{R}$.

Lemma 6. If the order of $\gamma$ is at least $k$ (in particular when $\gamma$ is primitive), the
affine space of solutions to (10) has dimension $d$ at most $s - 1$.

Further, one can compute using $O((Nm)^2)$ field operations over $\mathbb{F}_q$ a matrix
$M \in \mathbb{F}_q^{k \times d}$ (for some $d \le s - 1$) and a vector $z \in \mathbb{F}_q^k$ such
that the solutions are contained in the affine space $Mx + z$ for $x \in \mathbb{F}_q^d$. Also, the
matrix $M$ can be assumed to have the $d \times d$ identity matrix as a submatrix (without any
extra computation).

Proof. First, by factoring out a common power of $X$ that divides all of
$A_0(X), A_1(X), \ldots, A_s(X)$, we can assume that at least one $A_{i^*}(X)$ for some
$i^* \in \{0, 1, \ldots, s\}$ is not divisible by $X$, and has nonzero constant term. Further, if
$A_1(X), \ldots, A_s(X)$ are all divisible by $X$, then so is $A_0(X)$ (otherwise (10) has no
solutions), so we can take $i^* > 0$.

Let us denote $A_i(X) = \sum_{j=0}^{D+k-1} a_{i,j} X^j$ for $0 \le i \le s$. (We know that the
degree of $A_i(X)$ for $i \ge 1$ is at most $D$, so $a_{i,j} = 0$ when $i \ge 1$ and $j > D$, but
for notational ease let us introduce these coefficients.) Define the polynomial

$$B(X) = a_{1,0} + a_{2,0} X + a_{3,0} X^2 + \cdots + a_{s,0} X^{s-1}.$$

We know that $a_{i^*,0} \ne 0$, and therefore $B \ne 0$.

We will prove our upper bound on the dimension of the solution space by examining the condition
that the coefficient of $X^r$ of the polynomial

$$\Lambda(X) = A_0(X) + A_1(X) f(X) + A_2(X) f(\gamma X) + \cdots + A_s(X) f(\gamma^{s-1} X)$$

on the left hand side of (10) equals 0 for $r = 0, 1, 2, \ldots$.

The constant term of $\Lambda(X)$ equals
$a_{0,0} + a_{1,0} f_0 + a_{2,0} f_0 + \cdots + a_{s,0} f_0 = a_{0,0} + B(1) f_0$. Thus if
$B(1) \ne 0$, then $f_0$ is uniquely determined as $-a_{0,0}/B(1)$. If $B(1) = 0$, then
$a_{0,0} = 0$ or else there will be no solutions to (10), and in that case $f_0$ can take an
arbitrary value in $\mathbb{F}_q$.

The coefficient of $X^r$ of $\Lambda(X)$ equals

$$a_{0,r} + f_r \left( a_{1,0} + a_{2,0} \gamma^r + \cdots + a_{s,0} \gamma^{(s-1)r} \right) + f_{r-1} \left( a_{1,1} + a_{2,1} \gamma^{r-1} + \cdots + a_{s,1} \gamma^{(s-1)(r-1)} \right) + \cdots + f_1 \left( a_{1,r-1} + a_{2,r-1} \gamma + \cdots + a_{s,r-1} \gamma^{s-1} \right) + f_0 \left( a_{1,r} + \cdots + a_{s,r} \right) \qquad (11)$$

$$= B(\gamma^r) f_r + \left( \sum_{i=0}^{r-1} b_i^{(r)} f_i \right) + a_{0,r} \qquad (12)$$

for some coefficients $b_i^{(r)} \in \mathbb{F}_q$. The linear form (12) must thus equal 0. The key
point is that if $B(\gamma^r) \ne 0$, then this implies that $f_r$ is an affine combination of
$f_0, f_1, \ldots, f_{r-1}$ and in particular is uniquely determined given values of
$f_0, f_1, \ldots, f_{r-1}$.

Thus the dimension of the space of solutions is at most the number of $r$, $0 \le r < k$, for which
$B(\gamma^r) = 0$. Since $\gamma$ has order at least $k$, the powers $\gamma^r$ for $0 \le r < k$
are all distinct. Also we know that $B$ is a nonzero polynomial of degree at most $s - 1$. Thus
$B(\gamma^r) = 0$ for at most $s - 1$ values of $r$.

We have thus proved that the solution space has dimension at most $s - 1$. The claim about
quadratic complexity and the structure of the matrix $M$ follows since the equations (12) of the
linear system have a simple "lower-triangular" form. □

Combining Lemmas 3 and 6 and the decoding bound (9), we can conclude the following.

Theorem 7. For the folded Reed-Solomon code $\mathrm{FRS}_q^{(m)}[n, k]$ of block length
$N = n/m$ and rate $R = k/n$, the following holds for all integers $s$, $1 \le s \le m$. Given a
received word $y \in (\mathbb{F}_q^m)^N$, in $O((Nm \log q)^2)$ time, one can find a basis for a
subspace of dimension at most $s - 1$ that contains all message polynomials
$f \in \mathbb{F}_q[X]$ of degree less than $k$ whose FRS encoding (1) differs from $y$ in at most
a fraction

$$\frac{s}{s + 1} \left( 1 - \frac{mR}{m - s + 1} \right)$$

of the $N$ codeword positions.

Note: When $s = m = 1$, the above just reduces to a unique decoding algorithm up to a fraction
$(1 - R)/2$ of errors.
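
The coefficient-by-coefficient argument in the proof of Lemma 6 translates directly into a procedure. The following is a minimal sketch over a prime field $\mathbb{F}_p$ (an assumption made here); it uses only the equations for $X^0, \ldots, X^{k-1}$, so the affine space it returns contains the solution set but may be larger. The name `solve_candidates`, the return format, and the toy instance are hypothetical.

```python
def solve_candidates(A, p, gamma, k):
    """Return (M, z, free) such that every solution f of (10) is M x + z (mod p).

    A = [A_0, A_1, ..., A_s], each a coefficient list (lowest degree first).
    free lists the indices r with B(gamma^r) = 0; there are at most s - 1 of them.
    """
    s = len(A) - 1

    def lowest(a):  # index of the lowest nonzero coefficient, None for the zero polynomial
        return next((j for j, c in enumerate(a) if c % p), None)

    # Divide out the largest power of X common to all A_i (first step of the proof).
    shift = min(l for l in map(lowest, A) if l is not None)
    A = [a[shift:] for a in A]

    def coef(i, j):
        return A[i][j] % p if j < len(A[i]) else 0

    B = [coef(i, 0) for i in range(1, s + 1)]       # B(X) = a_{1,0} + a_{2,0} X + ...
    assert any(B), "expected some A_i with i >= 1 to have a nonzero constant term"

    def B_at(x):
        return sum(b * pow(x, e, p) for e, b in enumerate(B)) % p

    g = [pow(gamma, r, p) for r in range(k)]
    free = [r for r in range(k) if B_at(g[r]) == 0]
    d = len(free)

    forms = []  # forms[r] = [const, c_1, ..., c_d] encodes f_r = const + sum_t c_t * x_t
    for r in range(k):
        if r in free:
            form = [0] * (d + 1)
            form[1 + free.index(r)] = 1             # f_r becomes a fresh free parameter
        else:
            acc = [coef(0, r)] + [0] * d
            for l in range(r):
                # coefficient multiplying f_l in the X^r equation, as in (11)
                c = sum(coef(i, r - l) * pow(g[l], i - 1, p) for i in range(1, s + 1)) % p
                acc = [(u + c * v) % p for u, v in zip(acc, forms[l])]
            inv = pow(B_at(g[r]), -1, p)
            form = [(-u * inv) % p for u in acc]
        forms.append(form)

    z = [form[0] for form in forms]
    M = [form[1:] for form in forms]
    return M, z, free


# Toy instance over F_13 with gamma = 2: pick A_1, A_2 and a known solution f_true,
# then choose A_0 so that (10) holds for f_true.
p, gamma, k = 13, 2, 3
f_true = [1, 2, 3]
A1, A2 = [11, 1], [1, 5]

def polymul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

f_twisted = [(c * pow(gamma, l, p)) % p for l, c in enumerate(f_true)]   # f(gamma X)
A0 = [(-(u + v)) % p for u, v in zip(polymul(A1, f_true), polymul(A2, f_twisted))]
M, z, free = solve_candidates([A0, A1, A2], p, gamma, k)
```

In this instance $B(X) = 11 + X$ vanishes at $\gamma = 2$, so $f_1$ is the single free parameter and the rows of `M` indexed by `free` form an identity submatrix, mirroring the last claim of Lemma 6.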

Comment on runtime and list size. To get the actual list of close-by codewords, one can prune the
solution subspace, which unfortunately may take $q^s$ time in the worst-case. This quantity is
about $n^{O(1/\epsilon)}$ for the parameter choices which achieve a list decoding radius of
$1 - R - \epsilon$. Theoretically, we are not able to improve the worst-case list size bound of
$\approx n^{1/\epsilon}$ in this regime. This motivates our results in Section 4 where we show that
using a carefully chosen subset of all possible degree $k - 1$ polynomials as messages, one can
ensure that the list-size is much smaller while losing only a tiny amount in the rate.
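
As a quick numeric sanity check of the parameter choices, the decoding radius of Theorem 7 can be evaluated exactly with rational arithmetic. The helper name `decoding_radius` and the sample values ($R = 1/2$, $\epsilon = 1/10$) are illustrative choices made here.

```python
from fractions import Fraction


def decoding_radius(R, m, s):
    """Error fraction from Theorem 7: s/(s+1) * (1 - m*R/(m-s+1))."""
    R = Fraction(R)
    return Fraction(s, s + 1) * (1 - Fraction(m) * R / (m - s + 1))


# s = m = 1 recovers the unique-decoding radius (1 - R)/2.
unique = decoding_radius(Fraction(1, 2), m=1, s=1)

# eps = 1/10 with s = 1/eps = 10 and m = 1/eps^2 = 100, at rate R = 1/2.
eps = Fraction(1, 10)
radius = decoding_radius(Fraction(1, 2), m=100, s=10)
beats_target = radius > 1 - Fraction(1, 2) - eps
```

For these sample values the radius works out to $410/1001 \approx 0.4096$, slightly above the target $1 - R - \epsilon = 0.4$.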

Except for the final step of pruning the subspace of candidate solutions, the decoding takes only
quadratic time (and is perhaps even practical, as it just involves solving two structured linear
systems). In practice, for example when errors occur randomly, the dimension of the output subspace
will likely be very small, probably even 0, leading to a unique solution. If some side information
about the true message $f$ is available that disambiguates the true message in the list [9], that
might also be useful to speed up the pruning.
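
The pruning step itself is a brute-force scan of the affine space $Mx + z$. The sketch below, again over a prime field $\mathbb{F}_p$ as a simplifying assumption, re-encodes each candidate and keeps those agreeing with the received word in at least $t$ columns. The names `frs_columns` and `prune`, and the toy inputs, are hypothetical.

```python
from itertools import product


def frs_columns(coeffs, p, gamma, n, m):
    """FRS encoding of f (coefficients lowest degree first) as N = n/m columns."""
    def f(x):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        return acc
    evals = [f(pow(gamma, i, p)) for i in range(n)]
    return [tuple(evals[j * m:(j + 1) * m]) for j in range(n // m)]


def prune(M, z, y_cols, p, gamma, n, m, t):
    """Scan all p^d points of M x + z; keep f agreeing with y in >= t columns."""
    k, d = len(z), len(M[0])
    kept = []
    for x in product(range(p), repeat=d):           # worst case: p^d candidates
        f = [(z[r] + sum(M[r][c] * x[c] for c in range(d))) % p for r in range(k)]
        agree = sum(a == b for a, b in zip(frs_columns(f, p, gamma, n, m), y_cols))
        if agree >= t:
            kept.append(f)
    return kept


# Toy instance: candidates f = (1, x, 1 + x) over F_13; one of N = 3 columns is corrupted.
p, gamma, n, m = 13, 2, 12, 4
M, z = [[0], [1], [1]], [1, 0, 1]
y_cols = frs_columns([1, 2, 3], p, gamma, n, m)
y_cols[1] = (0, 0, 0, 0)                            # simulated error in column 1
result = prune(M, z, y_cols, p, gamma, n, m, t=2)
```

The loop over `product(range(p), repeat=d)` is exactly where the $q^{s-1}$ worst-case cost arises; when $d = 0$ it runs once and simply verifies the unique candidate $z$.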

3 Some further comments about the proof method 

We now make some salient remarks about the above linear-algebra based method to retrieve the 
space of polynomials /. 

Tightness of $q^{s-1}$ bound. The upper bound of $q^{s-1}$ on the number of solutions $f$ to the
Equation (10) cannot be improved in general. Indeed, let $A_0 = 0$, and $A_i$ for
$1 \le i \le s$ be the coefficient of $Y^{i-1}$ in the polynomial
$(Y - 1)(Y - \gamma) \cdots (Y - \gamma^{s-2})$. Then for $0 \le \ell \le s - 2$, we have

$$A_1 X^\ell + A_2 (\gamma X)^\ell + \cdots + A_s (\gamma^{s-1} X)^\ell = X^\ell \left( A_1 + A_2 \gamma^\ell + A_3 (\gamma^\ell)^2 + \cdots + A_s (\gamma^\ell)^{s-1} \right) = 0.$$

By linearity, every polynomial $f \in \mathbb{F}_q[X]$ of degree at most $s - 2$ satisfies (10). We
should add that this does not lead to any non-trivial list-size lower bound for decoding folded RS
codes, as we do not know if such a bad polynomial can occur as the output of the interpolation
step, and moreover the pruning step could potentially reduce the size of the list further.
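
The tightness example is easy to check numerically. The following minimal sketch over $\mathbb{F}_{13}$ with $\gamma = 2$ and $s = 4$ (toy values chosen here) builds the constants $A_1, \ldots, A_s$ and evaluates the bracketed scalar factor for each $\ell$; the helper names are hypothetical.

```python
def product_poly(roots, p):
    """Coefficients (lowest degree first) of prod (Y - r) over F_p."""
    coeffs = [1]
    for r in roots:
        new = [0] * (len(coeffs) + 1)
        for e, c in enumerate(coeffs):
            new[e + 1] = (new[e + 1] + c) % p       # c * Y^(e+1)
            new[e] = (new[e] - r * c) % p           # -r * c * Y^e
        coeffs = new
    return coeffs


p, gamma, s = 13, 2, 4
# A[i-1] holds A_i, the coefficient of Y^(i-1) in (Y - 1)(Y - gamma)...(Y - gamma^(s-2)).
A = product_poly([pow(gamma, j, p) for j in range(s - 1)], p)


def scalar_factor(l):
    """A_1 + A_2 gamma^l + ... + A_s (gamma^l)^(s-1), the factor multiplying X^l."""
    return sum(a * pow(gamma, i * l, p) for i, a in enumerate(A)) % p


factors = [scalar_factor(l) for l in range(s)]
```

The factor vanishes for $\ell = 0, \ldots, s - 2$ but not for $\ell = s - 1$, so exactly the monomials $1, X, \ldots, X^{s-2}$ are solutions, giving a solution space of dimension $s - 1$.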

Requirement on $\gamma$. The argument in Lemma 6 only required that the order of $\gamma$ is at
least $k$, and not that $\gamma$ is primitive. The polynomial $X^{q-1} - \gamma$ is irreducible if
and only if $\gamma$ is primitive, and therefore the approach based on extension fields discussed
in Section 2.1 requires $\gamma$ to be primitive. Usually in constructions of Reed-Solomon codes,
one takes the block length $n \approx q$ and therefore the dimension $k$ is linear in $q$ (for
constant rate codes). So this weakened requirement on $\gamma$ does not buy much flexibility in
this case. However, if for some reason one uses RS codes over much larger fields, then the new
argument applies to a broader set of choices of evaluation points for the RS codes.

Linear (instead of affine) space of solutions. With a slight worsening of parameters, we can
ensure that the space of solutions is in fact a linear space of dimension at most $s - 1$, instead
of the affine space ensured by Lemma 6. The idea is to not use $A_0(X)$ in the interpolation (or
rather set $A_0 = 0$), so that
$Q(X, Y_1, Y_2, \ldots, Y_s) = A_1(X) Y_1 + A_2(X) Y_2 + \cdots + A_s(X) Y_s$. With the degree of
each $A_i$ at most $D$, this gives us $(D + 1)s$ monomials in $Q$, and therefore condition (6) that
guarantees the existence of a nonzero $Q$ meeting the interpolation requirements (7) now becomes
$(D + 1)s > N(m - s + 1)$. Thus one can take
$D = \left\lfloor \frac{N(m - s + 1)}{s} \right\rfloor$. The condition $t(m - s + 1) \ge D + k$
that enables successful list decoding is thus met when the agreement parameter $t$ satisfies
$t \ge \frac{N}{s} + \frac{k}{m - s + 1} = N \left( \frac{1}{s} + \frac{mR}{m - s + 1} \right)$.
This is slightly worse than (9), but still allows for decoding from agreement $(R + \epsilon)N$ by
setting $s \approx 1/\epsilon$ and $m \approx 1/\epsilon^2$.

Hensel lifting. An alternate approach (to root-finding over extension fields) for finding the
low-degree solutions $f$ to the equation
$Q(X, f(X), f(\gamma X), \ldots, f(\gamma^{s-1} X)) = 0$ is based on Hensel-lifting. Here the idea
is to solve for $f \bmod X^i$ for $i = 1, 2, \ldots$ in turn. For example, the constant term $f_0$
of $f(X)$ must satisfy $Q(0, f_0, f_0, \ldots, f_0) = 0$. If $Q(0, Y, Y, \ldots, Y)$ is a nonzero
polynomial, then this will restrict the number of choices for $f_0$. For each such choice $f_0$,
solving $Q(X, f(X), \ldots, f(\gamma^{s-1} X)) \bmod X^2 = 0$ gives a polynomial equation for
$f_1$, and so on. This approach is discussed in [1] and [4, Chap. 5]. It is mentioned that this
algorithm is very fast experimentally and almost never explores too many candidate solutions. A
similar approach was also considered in [16] for folded versions of algebraic-geometric codes.
However, theoretically it has not been possible to derive any polynomial guarantees on the size of
the list returned by this approach or its running time (the obvious issue is that in each step
there may be more than one candidate value of $f_i$, leading to an exponential product bound on the
runtime). Polynomial bounds in special cases (e.g., when $s = 2$) are presented in [4], and
obtaining such theoretical bounds is posed as an interesting challenge for future work. Our Lemma 6
provides an analysis of the Hensel-lifting approach when the interpolated polynomial is linear in
the $Y_i$'s.

Additive folding? Let $p$ be a prime. Over $\mathbb{F}_p$, one can also consider additive folding
schemes, where the value $f(a)$ is bundled together with $f(a+1), f(a+2), \ldots, f(a+m-1)$, in a
construction similar to (1). The approach using extension fields can be used to show that the
number of polynomials $f \in \mathbb{F}_p[X]$ of degree less than $p$ satisfying
$Q(X, f(X), f(X+1), \ldots, f(X+s-1)) = 0$ for $Q \in \mathbb{F}_p[X, Y_1, Y_2, \ldots, Y_s]$ that
is linear in the $Y_i$'s is at most $p^{s-1}$. This follows by going modulo the polynomial
$X^p - X - 1$, which is irreducible over $\mathbb{F}_p$, and noting that
$f(X + 1) = f(X)^p \bmod (X^p - X - 1)$.$^5$ Is there a linear-algebraic proof similar to Lemma 6
for this case? The map $f(X) \mapsto f(\gamma X)$ acts diagonally on the standard basis
$\{1, X, \ldots, X^{k-1}\}$ for degree $k - 1$ polynomials, and this led to the nice structure for
the linear system (10). The linear transformation $f(X) \mapsto f(X + 1)$ is not diagonalizable, so
the upper bound on the rank of the solution space may need a more careful inspection of the
structure of the system
$A_0(X) + A_1(X) f(X) + A_2(X) f(X + 1) + \cdots + A_s(X) f(X + s - 1) = 0$.

5 The author first heard this argument for additive folding from Swastik Kopparty.
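
The identity $f(X + 1) \equiv f(X)^p \pmod{X^p - X - 1}$ used above follows from $f(X)^p = f(X^p)$ for $f \in \mathbb{F}_p[X]$ together with $X^p \equiv X + 1$. A minimal sketch checking it on one toy example ($p = 5$, $f = 1 + 2X + 3X^2$, both chosen here) is given below; `mulmod` and `shift_by_one` are hypothetical helper names.

```python
def mulmod(a, b, p):
    """Multiply polynomials over F_p and reduce modulo X^p - X - 1."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Reduce from the top: X^e = X^(e-p) * X^p = X^(e-p+1) + X^(e-p) for e >= p.
    for e in range(len(prod) - 1, p - 1, -1):
        c, prod[e] = prod[e], 0
        prod[e - p + 1] = (prod[e - p + 1] + c) % p
        prod[e - p] = (prod[e - p] + c) % p
    return prod[:p] + [0] * max(0, p - len(prod))   # normalize to length p


def shift_by_one(f, p):
    """Coefficients of f(X + 1), computed with Horner's rule."""
    out = [0]
    for c in reversed(f):
        nxt = [0] * (len(out) + 1)
        for e, v in enumerate(out):                 # out * (X + 1)
            nxt[e] = (nxt[e] + v) % p
            nxt[e + 1] = (nxt[e + 1] + v) % p
        nxt[0] = (nxt[0] + c) % p
        out = nxt
    return out


p = 5
f = [1, 2, 3]                                       # f(X) = 1 + 2X + 3X^2

frob = [1]
for _ in range(p):                                  # f(X)^p mod (X^p - X - 1)
    frob = mulmod(frob, f, p)

shifted = (shift_by_one(f, p) + [0] * p)[:p]        # f(X + 1), padded to length p
```

Both `frob` and `shifted` are length-$p$ coefficient lists, so the identity amounts to the two lists being equal.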

Derivative codes. Continuing the theme of the previous remark, when
$\mathrm{char}(\mathbb{F}_q) > k$, an analog of Lemma 6 for the differential equation

$$A_0(X) + A_1(X) f(X) + A_2(X) f'(X) + A_3(X) f''(X) + \cdots + A_s(X) f^{(s-1)}(X) = 0$$

is proved in [15] (here $f'(X)$ denotes the derivative of $f$ and $f^{(i)}(X)$ the $i$'th
derivative of $f$). This is then used in [15] to show that derivative codes over fields of large
characteristic can also achieve list decoding capacity. That is, they allow list decoding a
fraction $1 - R - \epsilon$ of errors with rate $R$, for a suitable choice of parameters.
Independently, Bombieri and Kopparty [3] have given an algorithm for list decoding derivative codes
up to a fraction $\approx 1 - R^{s/(s+1)}$ of errors using $(s+1)$-variate interpolation, matching
the performance of the author and Rudra's algorithm for folded RS codes [13].

Derivative codes (or univariate multiplicity codes) are the variant of Reed-Solomon codes where
the $i$'th codeword symbol consists of not only the value $f(a_i)$ at the $i$'th evaluation point, but also
the values of its first $m - 1$ derivatives (for some parameter $m \geq 1$). Over large characteristic, this
is the same (up to some constant factors) as the residue of $f$ mod $(X - a_i)^m$. Multivariate versions
of multiplicity codes were studied in the recent work of Kopparty, Saraf, and Yekhanin [18] where
they were used to give a surprising construction of codes of rate $1 - \epsilon$ locally decodable in $O(n^\gamma)$
time for any $\epsilon, \gamma > 0$.
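To illustrate the remark that a derivative-code symbol carries the same information as the residue of $f$ mod $(X - a_i)^m$, here is a small sketch over a prime field. It is an illustration under assumptions, not the construction of [15]: the function names and the sample values of $p$, $f$, $a$ and $m$ are hypothetical. It expands $f$ in powers of $(X - a)$ by repeated synthetic division and compares the coefficients $c_j$ with the derivatives, which satisfy $f^{(j)}(a) = j! \, c_j$.

```python
from math import factorial


def evaluate(f, a, p):
    """Horner evaluation of f (coefficients low degree first) at a, over F_p."""
    acc = 0
    for c in reversed(f):
        acc = (acc * a + c) % p
    return acc


def derivative(f, p):
    """Formal derivative of f over F_p."""
    return [(i * c) % p for i, c in enumerate(f)][1:]


def derivative_symbol(f, a, m, p):
    """The values f(a), f'(a), ..., f^(m-1)(a) bundled into one symbol."""
    values, g = [], list(f)
    for _ in range(m):
        values.append(evaluate(g, a, p))
        g = derivative(g, p)
    return values


def residue_coefficients(f, a, m, p):
    """Coefficients c_0..c_{m-1} with f(X) = sum_j c_j (X - a)^j mod (X - a)^m."""
    coeffs, g = [], list(f)
    for _ in range(m):
        partial, b = [], 0
        for c in reversed(g):            # synthetic division of g by (X - a)
            b = (c + a * b) % p
            partial.append(b)
        coeffs.append(partial[-1] if partial else 0)   # remainder equals g(a)
        g = list(reversed(partial[:-1]))               # quotient, low degree first
    return coeffs


# Hypothetical sample data.
p, a, m = 101, 10, 3
f = [5, 3, 0, 7, 2]
symbol = derivative_symbol(f, a, m, p)
residue = residue_coefficients(f, a, m, p)
print(all(symbol[j] == factorial(j) * residue[j] % p for j in range(m)))
```

The factors $j!$ are the "constant factors" in question; they are invertible exactly when the characteristic exceeds $j$, which is why the equivalence is stated for large characteristic.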

Multiplicities, soft decoding, and list recovery. For the linear interpolation of the form (4), using
multiplicities in the interpolation stage, as in [14], only hurts the performance. This is because
the degree of the $Y_i$'s cannot be increased to meet the needs of the larger number of interpolation
conditions. Thus in order to get a good decoder that can handle soft information on reliabilities of
various symbols [14, 17], one has to resort to the method behind the original algorithm in [13]. A
weaker form of soft decoding is the problem of list recovery, where for each position $i$ of the code
the input is a set $S_i$ of up to $\ell$ possible values, and the goal is to find all codewords whose $i$'th
symbol belongs to $S_i$ for at least $t$ values of $i$. For this problem, a straightforward extension of the
method of Section 2.2 gives an algorithm that works for agreement fraction $\tau = t/N$ satisfying

$$\tau > \frac{\ell}{s+1} + \frac{s}{s+1} \cdot \frac{mR}{m-s+1} .$$

The crucial point is that for any fixed $\ell$, by picking $s \approx \ell/\epsilon$ and $m \approx \ell/\epsilon^2$, we can list recover with
agreement fraction $\tau = R + \epsilon$; the agreement fraction required does not degrade with increasing
$\ell$. Such a list recovery guarantee is very useful in list decoding concatenated codes, for example
to construct binary codes list-decodable up to the Zyablov radius, or codes list-decodable up to
radius $1 - R - \epsilon$ over alphabets of fixed size independent of $n$; see [13, Sect. V].
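The effect of the parameter choice $s \approx \ell/\epsilon$, $m \approx \ell/\epsilon^2$ can be seen by evaluating the agreement threshold $\ell/(s+1) + \frac{s}{s+1} \cdot \frac{mR}{m-s+1}$ for a few input list sizes. The sketch below is only a numerical illustration; the function name and the sample values of $R$ and $\epsilon$ are assumptions, and exact rational arithmetic is used to avoid rounding questions.

```python
from fractions import Fraction


def agreement_threshold(ell, s, m, R):
    """Right-hand side of the bound: ell/(s+1) + (s/(s+1)) * m*R/(m-s+1)."""
    return Fraction(ell, s + 1) + Fraction(s, s + 1) * (m * R) / (m - s + 1)


# Hypothetical sample values for the rate and the slack.
R, eps = Fraction(1, 2), Fraction(1, 10)
for ell in (1, 2, 8, 32):
    s = int(ell / eps)        # s ~ ell / eps
    m = int(ell / eps ** 2)   # m ~ ell / eps^2
    print(ell, s, m, float(agreement_threshold(ell, s, m, R)))
```

With these sample values the threshold stays within a small constant multiple of $\epsilon$ above $R$ as $\ell$ grows, which is the sense in which the required agreement does not degrade.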

4 Improving list size via pseudorandom subspace-evasive subsets 

Based on Theorem 7, in this section we pursue one possible approach to improve the provable
worst-case list size bound for list decoding up to a fraction $1 - R - \epsilon$ of errors. Instead of allowing
all polynomials $f_0 + f_1 X + \cdots + f_{k-1} X^{k-1}$ of degree less than $k$ as messages, the idea is to restrict
the coefficient vector $(f_0, f_1, \dots, f_{k-1})$ to belong to some special subset $V \subseteq \mathbb{F}_q^k$, satisfying the
following two conflicting demands:

Largeness: The set $V$ must be large, say $|V| \geq q^{(1-\epsilon)k}$, so that the rate is reduced by at most a
$(1 - \epsilon)$ factor.

Low intersection with subspaces: For every subspace $S \subseteq \mathbb{F}_q^k$ of dimension $s$, $|S \cap V| \leq L$.

(Let us call a set $V$ with this property $(s, L)$-subspace-evasive for easy reference. The field $\mathbb{F}_q$ and
ambient dimension $k$ will be fixed in our discussion.)

Using such a set $V$ will ensure that after pruning the affine subspace output by the algorithm of
Theorem 7, the number of codewords will be at most $L$. (Note that an affine subspace of dimension
$s - 1$ is contained in a subspace of dimension $s$.) Thus the list size will go down from $q^s$ to $L$.
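For very small parameters, the $(s, L)$-subspace-evasive property can be checked by brute force, which may help make the definition concrete. The sketch below is an assumption-laden illustration: it takes $q$ prime, enumerates spans of all $s$-tuples of vectors (every span of at most $s$ vectors lies in some $s$-dimensional subspace, so the maximum is the same), and uses as a toy set the points $(x, x^2, x^3)$ over $\mathbb{F}_5$. The function names and the toy set are not from the paper.

```python
from itertools import product


def span(vectors, q, k):
    """All F_q-linear combinations of the given vectors (q assumed prime)."""
    points = set()
    for coeffs in product(range(q), repeat=len(vectors)):
        points.add(tuple(
            sum(c * v[i] for c, v in zip(coeffs, vectors)) % q for i in range(k)
        ))
    return points


def max_subspace_intersection(V, s, q, k):
    """Largest |S ∩ V| over subspaces S of F_q^k of dimension at most s.

    V is (s, L)-subspace-evasive exactly when this value is at most L.
    The search is exhaustive and only feasible for tiny q, k, s.
    """
    all_vectors = list(product(range(q), repeat=k))
    best = 0
    for generators in product(all_vectors, repeat=s):
        best = max(best, len(span(generators, q, k) & V))
    return best


# Toy set (illustrative only): the curve {(x, x^2, x^3)} in F_5^3.
q, k = 5, 3
V = {(x, x * x % q, x ** 3 % q) for x in range(q)}
print(max_subspace_intersection(V, 1, q, k))
```

The exhaustive search costs about $q^{ks}$ span computations, so it is meant only as a sanity check of the definition, not as a pruning procedure.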

Subspace-evasive subsets were used in [21] to construct bipartite Ramsey graphs, and in fact
we borrowed the term evasive from that work. In their work, the underlying field was $\mathbb{F}_2$ and
the subsets had to be evasive for dimension $s \approx k/2$. Our interest is in a different (and hopefully
easier?) regime: we can work over large fields, and are interested in evasiveness w.r.t.
$s$-dimensional subspaces for constant $s$.

Subspace-evasive subsets are also connected to certain well-studied objects called affine extractors;
see the discussion at the end of this section.

A random large subset of $\mathbb{F}_q^k$ meets the low subspace intersection requirement very well, as
shown below. The argument is straightforward; a similar bound appears in [5] in the geometric
context of point-subspace incidences.

Lemma 8. Let $W$ be a random subset of $\mathbb{F}_q^k$ chosen by including each $x \in \mathbb{F}_q^k$ in $W$ with probability $q^{-s-\alpha}$
for some $\alpha > 0$. Then with probability at least $1 - q^{-\Omega(k)}$, $W$ satisfies both the following conditions: (i)
$|W| \geq q^{k-s-\alpha}/2$, and (ii) $W$ is $(s, 2sk/\alpha)$-subspace-evasive.

Proof. The first part follows by a standard Chernoff bound calculation. For the second part, fix a
subspace $S \subseteq \mathbb{F}_q^k$ of dimension $s$, and a subset $T \subseteq S$ of size $t = \lceil 2ks/\alpha \rceil$. The probability that
$W \supseteq T$ equals $q^{-(s+\alpha)t}$. By a union bound over the at most $q^{ks}$ choices for the $s$-dimensional
subspace $S$, and the at most $q^{st}$ choices of $t$-element subsets $T$ of $S$, we get that the probability that
$W$ is not $(s, t-1)$-subspace-evasive is at most $q^{ks+st} \cdot q^{-(s+\alpha)t} \leq q^{-ks}$ since $t \geq 2ks/\alpha$. □
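For readers who want the arithmetic of the union bound spelled out, the exponent bookkeeping can be written as follows. This only restates the computation in the proof of Lemma 8, using the same symbols $s$, $k$, $t$, $\alpha$; the inequality uses $\alpha t \geq 2ks$.

```latex
\begin{align*}
\Pr[W \supseteq T] &= \bigl(q^{-s-\alpha}\bigr)^{t} = q^{-(s+\alpha)t}, \\
q^{ks} \cdot q^{st} \cdot q^{-(s+\alpha)t} &= q^{ks - \alpha t} \;\le\; q^{ks - 2ks} = q^{-ks}.
\end{align*}
```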

Picking $\alpha \approx \epsilon k$, the above guarantees the existence of subsets $W$ of $\mathbb{F}_q^k$ of size $q^{(1-\epsilon)k}$ which
are $(s, O(s/\epsilon))$-subspace-evasive. Restricting the coefficient vector $(f_0, f_1, \dots, f_{k-1})$ of the message
polynomial to belong to such a subset will guarantee a list-size upper bound of $O(s/\epsilon)$ in Theorem 7.
This list-size bound is a constant independent of $n$, and for the choice $s \approx 1/\epsilon$ which enables
list decoding a fraction $1 - R - \epsilon$ of errors, it is $O(1/\epsilon^2)$. This is quite close to the bound of $O(1/\epsilon)$
achieved by random codes [11].

Unfortunately, an explicit construction of subspace-evasive subsets with anywhere close to the
trade-off guaranteed by the probabilistic construction of Lemma 8 is not known. This appears to
be a challenging and extremely interesting question. One natural choice for such a subset would
be some variety $V \subseteq \mathbb{F}_q^k$ defined by a collection of polynomial equations, i.e.,
$V = \{a \in \mathbb{F}_q^k \mid g_1(a) = g_2(a) = \cdots = g_\ell(a) = 0\}$ for some polynomials
$g_1, g_2, \dots, g_\ell \in \mathbb{F}_q[Z_1, Z_2, \dots, Z_k]$. Indeed for $s = 1$
and $s = k - 1$, varieties in $\mathbb{F}_q^k$ (the modular moment surface and modular moment curve) with low
intersection with $s$-dimensional affine subspaces are known [5].

Connection to affine extractors. The problem of constructing subspace-evasive subsets is related
to the well-studied problem of constructing affine extractors. An affine extractor is an $M$-coloring
of $\mathbb{F}_q^k$ with the property that every $s$-dimensional affine subspace of $\mathbb{F}_q^k$ has between $(1/M - \delta)q^s$
and $(1/M + \delta)q^s$ elements belonging to each of the $M$ color classes. Here $\delta$ is the error and $\log_2 M$
is the number of output bits of the extractor. If we had an affine extractor with a large number of
outputs (say $M \geq q^{(1-\epsilon)s}$ for arbitrarily small constants $\epsilon > 0$) and very small error ($\delta \leq O(1/M)$,
in other words a small relative error instead of an additive error), then the subset corresponding to
a single color class will be subspace-evasive.

Known explicit constructions of affine extractors fall short of meeting these requirements. The
constructions in the literature either require large dimensions $s$ (and therefore are not applicable in
our setting of $s = O(1)$), or have too large an error to be useful for us. For instance, the extractor of
Gabizon and Raz [7], which works over large fields and any $s \geq 1$ (both aspects being perfect for
us), has an error $\delta \approx 1/\sqrt{q}$, due to the application of the Weil bounds on character sums. On the
other hand, an extractor satisfies a stronger property than what is needed in a subspace-evasive
subset, so we hope that good explicit constructions of subspace-evasive subsets will be easier to
obtain.

4.1 Pseudorandom construction of subspace-evasive subsets 

The construction of Lemma 8 takes exponential time and produces a random unstructured set that
takes exponential space to store. In this section, we show that a subset with similar guarantees can
be constructed in probabilistic polynomial time, producing a polynomial size representation of
the constructed subspace-evasive set. The idea is to note that the probabilistic argument to argue
about $(s, t)$-subspace-evasiveness only needed $t$-wise independence and not complete independence
of different elements of $\mathbb{F}_q^k$ landing in the random subset $W$.

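Fix an arbitrary basis $1, \beta, \beta^2, \dots, \beta^{k-1}$ of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$. Also denote $\mathbb{K} = \mathbb{F}_{q^k}$. For a polynomial
$P \in \mathbb{K}[X]$ and an integer $r$ ($1 \leq r \leq k$), define the subset $\mathcal{S}(P, r) \subseteq \mathbb{F}_q^k$ as follows:

$$\mathcal{S}(P, r) = \{ (a_0, a_1, \dots, a_{k-1}) \in \mathbb{F}_q^k \mid P(a_0 + a_1 \beta + a_2 \beta^2 + \cdots + a_{k-1} \beta^{k-1}) \in \mathbb{F}_q\text{-span}(1, \beta, \dots, \beta^{r-1}) \} .$$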
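The definition of $\mathcal{S}(P, r)$ can be made concrete in a toy setting. The sketch below assumes $q = 2$ and $k = 4$, represents $\mathbb{K} = \mathbb{F}_{2^4}$ as $\mathbb{F}_2[\beta]/(\beta^4 + \beta + 1)$, and encodes $a_0 + a_1\beta + a_2\beta^2 + a_3\beta^3$ as the integer whose bit $i$ is $a_i$. With this encoding, membership in $\mathbb{F}_2$-span$(1, \beta, \dots, \beta^{r-1})$ is just the test "value $< 2^r$". The field size, modulus, and function names are illustrative assumptions, not choices made in the paper.

```python
import random

K_BITS = 4          # k: K = F_{2^4}, viewed as F_2^4 via the basis 1, beta, beta^2, beta^3
MODULUS = 0b10011   # beta^4 + beta + 1, irreducible over F_2 (assumed choice)


def gf_mul(x, y):
    """Multiply two elements of F_{2^4}; bit i of an int is the coefficient of beta^i."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        y >>= 1
        x <<= 1
        if x >> K_BITS:      # reduce using beta^4 = beta + 1
            x ^= MODULUS
    return result


def evaluate(P, x):
    """Horner evaluation of P (coefficients in K, low degree first) at x in K."""
    acc = 0
    for c in reversed(P):
        acc = gf_mul(acc, x) ^ c
    return acc


def subset_S(P, r):
    """All a in F_2^4 (as ints) whose image P(a) lies in span(1, beta, ..., beta^(r-1))."""
    return {a for a in range(1 << K_BITS) if evaluate(P, a) < (1 << r)}


# A random polynomial of degree t = 3 (coefficient list of length t + 1).
random.seed(0)
P = [random.randrange(1 << K_BITS) for _ in range(4)]
print(sorted(subset_S(P, 2)))
```

Storing $P$ takes only $t + 1$ field elements, which is the compact representation the construction relies on; each membership test is a single polynomial evaluation.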

Lemma 9. Let $q$ be a prime power, $k \geq 1$ an integer, and denote $\mathbb{K} = \mathbb{F}_{q^k}$. Let $\zeta \in (0, 1)$ and $s$ be an integer
satisfying $1 \leq s \leq \zeta k/2$. Let $P \in \mathbb{K}[X]$ be a random polynomial of degree $t$ and define $V = \mathcal{S}(P, (1 - \zeta)k)$.
Then, provided $t \geq \Omega(s/\zeta)$, with probability at least $1 - q^{-\Omega(k)}$ over the choice of $P$, $V$ is a $(s, t)$-subspace-evasive
subset of $\mathbb{F}_q^k$ of size at least $q^{(1-\zeta)k}/2$.

Proof. For each $x \in \mathbb{F}_q^k$, note that $x \in \mathcal{S}(P, r)$ with probability $q^{-\zeta k}$. Further, since the values of $P$
at any $t$ distinct points in $\mathbb{K}$ are independent, the events $x \in \mathcal{S}(P, r)$ for various $x \in \mathbb{F}_q^k$ are $t$-wise
independent. The argument in Lemma 8 only relied on the $t$-wise independence of these events,
and therefore one can conclude that $V = \mathcal{S}(P, r)$ is $(s, O(s/\zeta))$-subspace-evasive with probability
at least $1 - q^{-\Omega(ks)}$.

The expected size of $V$ is $\mathbb{E}[|V|] = q^{(1-\zeta)k}$. Since the events $x \in V$ are pairwise independent, the
variance of $|V|$ is at most $\mathbb{E}[|V|]$, so by Chebyshev's inequality, $\Pr[|V| < \mathbb{E}[|V|]/2] \leq 4/\mathbb{E}[|V|]$.
Hence $|V| \geq q^{(1-\zeta)k}/2$ except with probability at most $q^{-\Omega(k)}$. □

Note that the set $\mathcal{S}(P, r)$ has a compact representation, and given $P$, membership in $\mathcal{S}(P, r)$ can
be checked efficiently. In fact, it is easy to see that $\mathcal{S}(P, r)$ is a variety in $\mathbb{F}_q^k$. Indeed,
$P(a_0 + a_1 \beta + \cdots + a_{k-1} \beta^{k-1})$ can be expanded out as
$p_0(a_0, a_1, \dots, a_{k-1}) + p_1(a_0, \dots, a_{k-1}) \beta + \cdots + p_{k-1}(a_0, \dots, a_{k-1}) \beta^{k-1}$
for polynomials $p_0, p_1, \dots, p_{k-1} \in \mathbb{F}_q[Z_1, Z_2, \dots, Z_k]$, and therefore

$$\mathcal{S}(P, r) = \{ a = (a_0, a_1, \dots, a_{k-1}) \in \mathbb{F}_q^k \mid p_r(a) = p_{r+1}(a) = \cdots = p_{k-1}(a) = 0 \} .$$ (13)

Combining this with Theorem 7, we can conclude the following. 

Theorem 10. For any $\zeta > 0$, there is a Monte Carlo construction of a subcode $C$ of $\mathrm{FRS}_q^{(m)}[n, k]$,
consisting of encodings of polynomials whose coefficients belong to a variety $V \subseteq \mathbb{F}_q^k$, such that with high
probability $C$ has rate at least $(1 - \zeta)k/n$ and can be list decoded from a fraction
$\frac{s}{s+1}\left(1 - \frac{m}{m-s+1} \cdot \frac{k}{n}\right)$ of errors, for any
$1 \leq s \leq m$, in $q^{O(s)}$ time with an output list size of at most $O(s/\zeta)$.

In particular, picking $\zeta = \Theta(\epsilon)$, $s = \Theta(1/\epsilon)$ and $m = \Theta(1/\epsilon^2)$, the construction yields codes of rate
$R$ which can be list decoded from a fraction $1 - R - \epsilon$ of errors in polynomial time, with at most $O(1/\epsilon^2)$
codewords output in the list.

Encoding complexity. In the above construction, the code can be succinctly stored and membership
in the code efficiently tested. However, we do not know a way to output the $i$'th codeword
in the code (i.e., to perform encoding) in polynomial time. We now show that efficient encoding
can also be achieved if we settle for a list size of $O(k)$ (which is still much better than the $q^{O(1/\epsilon)}$
bound).

The idea is to apply Lemma 9 with the parameter choice $\zeta = 2s/k$ and $t = \Theta(k)$, and taking
$V = \mathcal{S}(P, k - 2s)$ for a random degree $t$ polynomial $P$. Now with very high probability over the choice
of $P$, standard tail inequalities for $t$-wise independent random variables (e.g., [2]) imply that for
every choice of $f_0, f_1, \dots, f_{k-3s-1}$, there are at least $q^s/2$ elements $(a_0, a_1, \dots, a_{k-1}) \in \mathcal{S}(P, r)$ such
that $a_i = f_i$ for $0 \leq i < k - 3s$. In particular such a $k$-tuple can be found in $q^{O(s)}$ time by searching
over all possible values of $(a_{k-3s}, \dots, a_{k-1})$. We can use an arbitrary such tuple $(a'_{k-3s}, \dots, a'_{k-1})$
(say the lexicographically smallest) as the $3s$ highest degree coefficients and encode the message
$(f_0, f_1, \dots, f_{k-3s-1}) \in \mathbb{F}_q^{k-3s}$ by the folded RS encoding (1) of the polynomial
$f_0 + f_1 X + \cdots + f_{k-3s-1} X^{k-3s-1} + a'_{k-3s} X^{k-3s} + \cdots + a'_{k-1} X^{k-1}$.
Note that we only purge $3s$ symbols from $\mathbb{F}_q$ in
the messages so the rate is $R - o(1)$. The list decoder can simply discard the top (highest degree)
$3s$ coefficients of any recovered polynomial to find the actual message tuple.
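The encoding step described above, namely padding a message with the lexicographically smallest suffix that lands in the evasive set, can be sketched generically. In the sketch below the membership test is passed in as a function, so nothing about the actual set $\mathcal{S}(P, k - 2s)$ is assumed; the function names and the toy membership predicate are hypothetical and serve only to exercise the search. The folded RS encoding (1) itself is not reproduced here.

```python
from itertools import product


def complete_to_member(prefix, suffix_len, q, is_member):
    """Return prefix + the lexicographically smallest suffix in {0..q-1}^suffix_len
    for which is_member(...) holds, or None if no suffix works.

    The search visits up to q^suffix_len candidates, mirroring the q^{O(s)}
    search over the top 3s coefficients.
    """
    for suffix in product(range(q), repeat=suffix_len):
        candidate = tuple(prefix) + suffix
        if is_member(candidate):
            return candidate
    return None


def recover_message(coefficients, suffix_len):
    """Decoder side: drop the padded highest-degree coefficients."""
    return tuple(coefficients[:len(coefficients) - suffix_len])


def toy_member(v):
    """Stand-in membership test (NOT the variety from the paper)."""
    return sum(v) % 3 == 0 and v[-1] != 0


padded = complete_to_member((1, 2), 2, 3, toy_member)
print(padded, recover_message(padded, 2))
```

Because `itertools.product` enumerates tuples in lexicographic order, the first hit is the lexicographically smallest valid suffix, so encoder and decoder agree on it without extra bookkeeping.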

One obvious open question raised by the above is to construct the claimed variety (even with a
somewhat worse list size guarantee) explicitly. This would make the code explicit, and if the variety
is sufficiently well-structured, also imply a nice encoding function. Even more exciting would
be to construct a subspace-evasive subset for which the intersection with an $s$-dimensional subspace
can be computed efficiently, in time polynomial in the size of the intersection. This would
avoid the need for the $q^s$ runtime bottleneck arising from exhaustively checking all candidates in
the subspace for membership in the variety.

One point worth noting is that the degree of each of the polynomials $p_i \in \mathbb{F}_q[Z_1, Z_2, \dots, Z_k]$
defining the variety (13) is $\Omega(s/\zeta)$ and there are $\zeta k$ of them, so bounding the size of the variety
by the product of the degrees via Bezout's theorem would lead to uselessly large bounds. Even
the existence of a variety cut out by say $O(s)$ polynomials each with degree at most $O(s)$ that is
$(s, s^{O(s)})$-subspace-evasive does not appear to be known.